Long-Distance Rhythmic Dependencies and their Application to Automatic Language Identification

Authors

  • Joseph Tepperman
  • Emily Nava
Abstract

The perception of rhythmic differences among languages relies on variations in periodicity within prominence groups. However, the consensus in phonetic research on rhythm is that existing measures do not capture true rhythm by that definition; instead, they merely measure short-term timing. This work proposes a new rhythm measure, the Generalized Variability Index (GVI), that examines durational contexts over arbitrarily long linguistic distances. To evaluate this new measure, we conducted a set of experiments in automatic language identification using large amounts of data from 11 languages in the GlobalPhone and TIMIT corpora. When added to baseline rhythm measures, the new GVI features yield an absolute improvement in 11-way language classification accuracy of as much as 12%. Moreover, adding progressively wider durational context to the GVI continues to contribute information useful for automatic language ID, with the benefit tapering off only at a distance of about 10 syllables.
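
The abstract does not give the GVI formula, so the following is only an illustrative sketch of the general idea of comparing durations across longer distances. It uses the widely known normalized Pairwise Variability Index (nPVI) and adds a `lag` parameter; the function name `lagged_npvi` and the lag extension are assumptions made for illustration and are not the authors' GVI definition.

```python
from statistics import mean


def lagged_npvi(durations, lag=1):
    """Normalized pairwise variability between units `lag` positions apart.

    With lag=1 this is the standard nPVI over adjacent units.
    lag > 1 is a hypothetical extension for illustration only; it is
    NOT the GVI as defined in the paper.
    """
    if lag < 1:
        raise ValueError("lag must be >= 1")
    # Pair each duration with the one `lag` positions later.
    pairs = list(zip(durations, durations[lag:]))
    if not pairs:
        raise ValueError("need more than `lag` durations")
    # Each difference is normalized by the mean of the pair, then averaged.
    return 100 * mean(abs(a - b) / ((a + b) / 2) for a, b in pairs)
```

Under these assumptions, a feature vector covering contexts of up to 10 units could be formed as `[lagged_npvi(d, k) for k in range(1, 11)]` for a duration sequence `d` with more than 10 entries.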


Similar resources

Can Automatically Extracted Rhythmic Units Discriminate among Languages?

This paper deals with rhythmic modeling and its application to language identification. Beside phonetics and phonotactics, rhythm is actually one of the most promising features to be considered for language identification, but significant problems are unresolved for its modeling. In this paper, an algorithm dedicated to rhythmic segmentation is described. Experiments are performed on read speec...


The musical language Elements of Persian musical language: modes, rhythm and syntax

In treating the subject of musical language, a Persian musician would be intrinsically drawn to the structural similarities between the Persian music and language. Indeed Persian music and language are extremely related in their metrics, intonations and structural phrases (syntax). Although we will draw upon this relationship, our aim in this article is to present “music as a language,” c...


Kohonen Self Organizing for Automatic Identification of Cartographic Objects

Automatic identification and localization of cartographic objects in aerial and satellite images have gained increasing attention in recent years in digital photogrammetry and remote sensing. Although the automatic extraction of man made objects in essence is still an unresolved issue, the man made objects can be extracted from aerial photos and satellite images. Recently, the high-resolution s...


Online Processing of English Wh-Dependencies by Iranian EFL Learners

To be able to reach the level of ultimate attainment in the second language, learners need to acquire not only the grammar of the L2 but also the language processing mechanisms involved in the comprehension of sentences in real time. Contrary to its importance, very little is known yet about online L2 processing. This study examines whether advanced Iranian learners of English reactivate disloc...


A Fast Re-scoring Strategy to Capture Long-Distance Dependencies

A re-scoring strategy is proposed that makes it feasible to capture more long-distance dependencies in the natural language. Two pass strategies have become popular in a number of recognition tasks such as ASR (automatic speech recognition), MT (machine translation) and OCR (optical character recognition). The first pass typically applies a weak language model (n-grams) to a lattice and the sec...




Publication date: 2011